Parameterized DAWGs: Efficient constructions and bidirectional pattern searches
Authors
Abstract
Two strings $x$ and $y$ over $\Sigma \cup \Pi$ of equal length are said to \emph{parameterized match} (\emph{p-match}) if there is a renaming bijection $f:\Sigma \cup \Pi \rightarrow \Sigma \cup \Pi$ that is the identity on $\Sigma$ and transforms $x$ to $y$ (or vice versa). The \emph{p-matching} problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose \emph{parameterized suffix automata} (\emph{p-suffix automata}) and \emph{parameterized directed acyclic word graphs} (\emph{PDAWGs}), which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have $\Theta(n^2)$ nodes and edges but PDAWGs have only $O(n)$ nodes and edges, where $n$ is the length of an input string. We also give an $O(n |\Pi| \log (|\Pi| + |\Sigma|))$-time $O(n)$-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner. This duality leads us to two further efficient algorithms for p-matching: given the parameterized suffix tree for the reversal of the input string $T$, one can build the PDAWG of $T$ in $O(n)$ time in an offline manner; and one can perform \emph{bidirectional} p-matching in $O(m \log (|\Pi|+|\Sigma|) + \mathit{occ})$ time using $O(n)$ space, where $m$ denotes the pattern length and $\mathit{occ}$ is the number of pattern occurrences in the text $T$.
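To make the definition of a p-match concrete, the following is a minimal sketch in Python of a naive check taken directly from the definition. It is not the PDAWG construction or the bidirectional search from the paper. The function names `p_match` and `p_occurrences` and the `static` argument (standing for $\Sigma$, with every other character treated as a member of $\Pi$) are assumptions made for illustration.

```python
def p_match(x, y, static):
    """Return True if strings x and y parameterized-match.

    `static` is the set of static characters (Sigma); every other
    character is treated as a parameter character (Pi). Hypothetical
    helper for illustration only.
    """
    if len(x) != len(y):
        return False
    fwd, bwd = {}, {}  # partial renaming x -> y and its inverse
    for a, b in zip(x, y):
        if a in static or b in static:
            # Static characters must match exactly.
            if a != b:
                return False
            continue
        # Parameter characters must be renamed consistently and injectively.
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True


def p_occurrences(text, pattern, static):
    """Naive scan: start positions of substrings of text that p-match pattern."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if p_match(text[i:i + m], pattern, static)]


# Example with Sigma = {"a"} and parameter characters x, y, z.
print(p_occurrences("xayxazx", "yaz", {"a"}))  # [0, 3]
```

This scan takes time proportional to $nm$ in the worst case. The index structures described in the abstract exist precisely to avoid rescanning the text for every pattern.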
Similar resources
Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching
Let S and S′ be two strings, having the same length, over a totally-ordered alphabet. We consider the following two variants of string matching. Parameterized Matching: The characters of S and S′ are partitioned into static characters and parameterized characters. The strings are a parameterized match iff the static characters match exactly, and there exists a one-to-one function which renames ...
Efficient Dynamic Dictionary Matching with DAWGs and AC-automata
The dictionary matching is a task to find all occurrences of pattern strings in a set D (called a dictionary) on a text string T . The Aho-Corasick-automaton (AC-automaton) which is built on D is a fundamental data structure which enables us to solve the dictionary matching problem in O(d log σ) preprocessing time and O(n log σ + occ) matching time, where d is the total length of the patterns i...
Parameterized Pattern Matching - Succinctly
The fields of succinct data structures and compressed text indexing have seen quite a bit of progress over the last 15 years. An important achievement, primarily using techniques based on the Burrows-Wheeler Transform (BWT), was obtaining the full functionality of suffix tree in the optimal number of bits. A crucial property that allows the use of BWT for designing compressed indexes is order-p...
Parameterized pattern queries
We introduce parameterized pattern queries as a new paradigm to extend traditional pattern expressions over sequence databases. A parameterized pattern is essentially a string made of constant symbols or variables where variables can be matched against any symbol of the input string. Parameterized patterns allow a concise and expressive definition of regular expressions that would be very compl...
Triangular norms. Position paper II: general constructions and parameterized families
This second part (out of three) of a series of position papers on triangular norms (for Part I see [43]) deals with general construction methods based on additive and multiplicative generators, and on ordinal sums. Also included are some constructions leading to non-continuous t-norms, and a presentation of some distinguished families of t-norms.
Journal
Journal title: Theoretical Computer Science
Year: 2022
ISSN: 1879-2294, 0304-3975
DOI: https://doi.org/10.1016/j.tcs.2022.09.008